在本文中,我们推出了一种新的通用依赖树木库,用于亚马逊尼亚的一种濒危语言:秘鲁在秘鲁说的Panoan语言Kakataibo。我们首先讨论实施的协作方法,事实证明,在本科生的计算语言课程的背景下创建树库有效。然后,我们描述了树库的一般细节以及针对拟议的注释实施的特定于语言的注意事项。我们最终对词性标记和句法依赖性解析进行了一些实验。我们专注于单语和转移学习设置,在这里我们研究了另一种Panoan语言资源的Shipibo-Konibo Treebos的影响。
translated by 谷歌翻译
We present a Machine Learning (ML) study case to illustrate the challenges of clinical translation for a real-time AI-empowered echocardiography system with data of ICU patients in LMICs. Such ML case study includes data preparation, curation and labelling from 2D Ultrasound videos of 31 ICU patients in LMICs and model selection, validation and deployment of three thinner neural networks to classify apical four-chamber view. Results of the ML heuristics showed the promising implementation, validation and application of thinner networks to classify 4CV with limited datasets. We conclude this work mentioning the need for (a) datasets to improve diversity of demographics, diseases, and (b) the need of further investigations of thinner models to be run and implemented in low-cost hardware to be clinically translated in the ICU in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.
translated by 谷歌翻译
In this paper we present TruFor, a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture that combines the RGB image and a learned noise-sensitive fingerprint. The latter learns to embed the artifacts related to the camera internal and external processing by training only on real data in a self-supervised manner. Forgeries are detected as deviations from the expected regular pattern that characterizes each pristine image. Looking for anomalies makes the approach able to robustly detect a variety of local manipulations, ensuring generalization. In addition to a pixel-level localization map and a whole-image integrity score, our approach outputs a reliability map that highlights areas where localization predictions may be error-prone. This is particularly important in forensic applications in order to reduce false alarms and allow for a large scale analysis. Extensive experiments on several datasets show that our method is able to reliably detect and localize both cheapfakes and deepfakes manipulations outperforming state-of-the-art works. Code will be publicly available at https://grip-unina.github.io/TruFor/
translated by 谷歌翻译
Modelling the temperature of Electric Vehicle (EV) batteries is a fundamental task of EV manufacturing. Extreme temperatures in the battery packs can affect their longevity and power output. Although theoretical models exist for describing heat transfer in battery packs, they are computationally expensive to simulate. Furthermore, it is difficult to acquire data measurements from within the battery cell. In this work, we propose a data-driven surrogate model (LiFe-net) that uses readily accessible driving diagnostics for battery temperature estimation to overcome these limitations. This model incorporates Neural Operators with a traditional numerical integration scheme to estimate the temperature evolution. Moreover, we propose two further variations of the baseline model: LiFe-net trained with a regulariser and LiFe-net trained with time stability loss. We compared these models in terms of generalization error on test data. The results showed that LiFe-net trained with time stability loss outperforms the other two models and can estimate the temperature evolution on unseen data with a relative error of 2.77 % on average.
translated by 谷歌翻译
We present edBB-Demo, a demonstrator of an AI-powered research platform for student monitoring in remote education. The edBB platform aims to study the challenges associated to user recognition and behavior understanding in digital platforms. This platform has been developed for data collection, acquiring signals from a variety of sensors including keyboard, mouse, webcam, microphone, smartwatch, and an Electroencephalography band. The information captured from the sensors during the student sessions is modelled in a multimodal learning framework. The demonstrator includes: i) Biometric user authentication in an unsupervised environment; ii) Human action recognition based on remote video analysis; iii) Heart rate estimation from webcam video; and iv) Attention level estimation from facial expression analysis.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
The class of bigraphical lasso algorithms (and, more broadly, 'tensor'-graphical lasso algorithms) has been used to estimate dependency structures within matrix and tensor data. However, all current methods to do so take prohibitively long on modestly sized datasets. We present a novel tensor-graphical lasso algorithm that analytically estimates the dependency structure, unlike its iterative predecessors. This provides a speedup of multiple orders of magnitude, allowing this class of algorithms to be used on large, real-world datasets.
translated by 谷歌翻译
得益于深度学习的最新进展,如今存在复杂的生成工具,这些工具产生了极其现实的综合语音。但是,这种工具的恶意使用是可能的,有可能对我们的社会构成严重威胁。因此,合成语音检测已成为一个紧迫的研究主题,最近提出了各种各样的检测方法。不幸的是,它们几乎没有概括为在训练阶段从未见过的工具产生的合成音频,这使他们不适合面对现实世界的情况。在这项工作中,我们旨在通过提出一种仅利用说话者的生物特征的新检测方法来克服这个问题,而无需提及特定的操纵。由于仅在实际数据上对检测器进行训练,因此可以自动确保概括。建议的方法可以基于现成的扬声器验证工具实现。我们在三个流行的测试集上测试了几种这样的解决方案,从而获得了良好的性能,高概括能力和高度鲁棒性。
translated by 谷歌翻译
培训低级的深层神经网络,即使用分解层,特别是社区感兴趣的:它在记忆消耗和训练时间方面提供了对未分离培训的效率。先前的工作集中在预训练的网络的低级近似值和低级空间中的培训中,并提供了其他目标,为所选实践提供了各种临时解释。我们分析了在实践中运作良好的技术,并通过对诸如GPT2之类的模型进行广泛的消融,我们提供了证据表明该领域的共同信念,这暗示着令人兴奋的研究机会仍然需要回答。
translated by 谷歌翻译
人类利用先验知识来描述图像,并能够使其解释适应特定的上下文信息,即使在上下文信息和图像不匹配时,也可以在发明合理的解释的范围内。在这项工作中,我们提出了通过整合上下文知识来字幕Wikipedia图像的新颖任务。具体而言,我们制作的模型共同推理了Wikipedia文章,Wikimedia图像及其相关描述以产生上下文化的标题。特别是,可以使用类似的Wikimedia图像来说明不同的文章,并且所产生的标题需要适应特定的上下文,因此使我们能够探索模型的限制以调整标题为不同的上下文信息。该领域中的一个特殊挑战性的任务是处理量不多的单词和命名实体。为了解决这个问题,我们提出了一个预训练目标,掩盖了命名实体建模(MNEM),并表明与基线模型相比,此借口任务可以改善。此外,我们验证了Wikipedia中使用MNEM目标预先训练的模型可以很好地推广到新闻字幕数据集。此外,我们根据字幕任务的难度定义了两种不同的测试拆分。我们提供有关每种方式的作用和重要性的见解,并突出我们模型的局限性。接受时,代码,模型和数据拆分可公开可用。
translated by 谷歌翻译